47 research outputs found

    Formal Verification of an SSA-based Middle-end for CompCert

    Get PDF
    CompCert is a formally verified compiler that generates compact and efficient PowerPC, ARM and x86 code for a large and realistic subset of the C language. However, CompCert foregoes using Static Single Assignment (SSA), an intermediate representation that allows for writing simpler and faster optimizers, and that is used by many compilers. In fact, it has remained an open problem to verify formally an SSA-based compiler middle-end. We report on a formally verified, SSA-based, middle-end for CompCert. Our middle-end performs conversion from CompCert intermediate form to SSA form, optimization of SSA programs, including Global Value Numbering, and transforming out of SSA to intermediate form. In addition to provide the first formally verified SSA-based middle-end, we address two problems raised by Leroy in 2009: giving a simple and intuitive formal semantics to SSA, and leveraging the global properties of SSA to reason locally about program optimizations

    A Verified Information-Flow Architecture

    Get PDF
    SAFE is a clean-slate design for a highly secure computer system, with pervasive mechanisms for tracking and limiting information flows. At the lowest level, the SAFE hardware supports fine-grained programmable tags, with efficient and flexible propagation and combination of tags as instructions are executed. The operating system virtualizes these generic facilities to present an information-flow abstract machine that allows user programs to label sensitive data with rich confidentiality policies. We present a formal, machine-checked model of the key hardware and software mechanisms used to dynamically control information flow in SAFE and an end-to-end proof of noninterference for this model. We use a refinement proof methodology to propagate the noninterference property of the abstract machine down to the concrete machine level. We use an intermediate layer in the refinement chain that factors out the details of the information-flow control policy and devise a code generator for compiling such information-flow policies into low-level monitor code. Finally, we verify the correctness of this generator using a dedicated Hoare logic that abstracts from low-level machine instructions into a reusable set of verified structured code generators

    Fondements sémantiques des représentations intermédiaires de programmes

    No full text
    An end-to-end guarantee of software correctness by formal verification must consider two sources of bugs. First, the verification tool must be correct. Second, programs are often verified at the source level, before being compiled. Hence, compilers should also be trustworthy. Verifiers and compilers' complexity is increasing. To simplify code analysis and manipulation, these tools rely on intermediate representations (IR) of programs, that provide structural and semantic properties. This thesis gives a formal, semantic account on IRs, so that they can also be leveraged in the formal proof of such tools. We first study a register-based IR of Java bytecode used in compilers and verifiers. We specify the IR generation by a semantic theorem stating what the transformation preserves, e.g. object initialization or exceptions, but also what it modifies and how, e.g. object allocation. We implement this IR in Sawja, a Java static analysis toolbench. Then, we study the Static Single Assignment (SSA) form, an IR widely used in modern compilers and verifiers. We implement and prove in Coq an SSA middle-end for the CompCert C compiler. For the proof of SSA optimizations, we identify a key semantic property of SSA, allowing for equational reasoning. Finally, we study the semantics of concurrent Java IRs. Due to instruction reorderings performed by the compiler and the hardware, the current definition of the Java Memory Model (JMM) is complex, and unfortunately formally flawed. Targetting x86 architectures, we identify a subset of the JMM that is intuitive and tractable in formal proofs. We characterize the reorderings it allows, and factor out a proof common to the IRs of a compiler.La vérification formelle de programme n'apporte pas de garantie complète si l'outil de vérification est incorrect. Et, si un programme est vérifié au niveau source, le compilateur pourrait introduire des bugs. Les compilateurs et vérifieurs actuels sont complexes. Pour simplifier l'analyse et la transformation de code, ils utilisent des représentations intermédiaires (IR) de programme, qui ont de fortes propriétés structurelles et sémantiques. Cette thèse étudie d'un point de vue sémantique et formel les IRs, afin de faciliter la preuve de ces outils. Nous étudions d'abord une IR basée registre du bytecode Java. Nous prouvons un théorème sur sa génération, explicitant ce que la transformation préserve (l'initialisation d'objet, les exceptions) et ce qu'elle modifie et comment (l'ordre d'allocation). Nous implantons l'IR dans Sawja, un outil de développement d'analyses statiques de Java. Nous étudions aussi la forme SSA, une IR au coeur des compilateurs et vérifieurs modernes. Nous implantons et prouvons en Coq un middle-end SSA pour le compilateur C CompCert. Pour la preuve des optimisations, nous prouvons un invariant sémantique de SSA clé pour le raisonnement équationnel. Enfin, nous étudions la sémantique des IRs de Java concurrent. La définition actuelle du Java Memory Model (JMM) autorise les optimisations aggressives des compilateurs et des architectures parallèles. Complexe, elle est formellement cassée. Ciblant les architectures x86, nous proposons un sous-ensemble du JMM intuitif et adapté à la preuve formelle. Nous le caractérisons par ses réordonnancements, et factorisons cette preuve sur les IRs d'un compilateur

    JFLA 2024 - 35es Journées Francophones des Langages Applicatifs

    No full text
    National audienc

    Fondements sémantiques des représentations intermédiaires de programmes

    No full text
    La vérification formelle de programme n'apporte pas de garantie complète si l'outil de vérification est incorrect. Et, si un programme est vérifié au niveau source, le compilateur pourrait introduire des bugs. Les compilateurs et vérifieurs actuels sont complexes. Pour simplifier l'analyse et la transformation de code, ils utilisent des représentations intermédiaires (IR) de programme, qui ont de fortes propriétés structurelles et sémantiques. Cette thèse étudie d'un point de vue sémantique et formel les IRs, afin de faciliter la preuve de ces outils. Nous étudions d'abord une IR basée registre du bytecode Java. Nous prouvons un théorème sur sa génération, explicitant ce que la transformation préserve (l'initialisation d'objet, les exceptions) et ce qu'elle modifie et comment (l'ordre d'allocation). Nous implantons l'IR dans Sawja, un outil de développement d'analyses statiques de Java. Nous étudions aussi la forme SSA, une IR au coeur des compilateurs et vérifieurs modernes. Nous implantons et prouvons en Coq un middle-end SSA pour le compilateur C CompCert. Pour la preuve des optimisations, nous prouvons un invariant sémantique de SSA clé pour le raisonnement équationnel. Enfin, nous étudions la sémantique des IRs de Java concurrent. La définition actuelle du Java Memory Model (JMM) autorise les optimisations aggressives des compilateurs et des architectures parallèles. Complexe, elle est formellement cassée. Ciblant les architectures x86, nous proposons un sous-ensemble du JMM intuitif et adapté à la preuve formelle. Nous le caractérisons par ses réordonnancements, et factorisons cette preuve sur les IRs d'un compilateur.An end-to-end guarantee of software correctness by formal verification must consider two sources of bugs. First, the verification tool must be correct. Second, programs are often verified at the source level, before being compiled. Hence, compilers should also be trustworthy. Verifiers and compilers' complexity is increasing. To simplify code analysis and manipulation, these tools rely on intermediate representations (IR) of programs, that provide structural and semantic properties. This thesis gives a formal, semantic account on IRs, so that they can also be leveraged in the formal proof of such tools. We first study a register-based IR of Java bytecode used in compilers and verifiers. We specify the IR generation by a semantic theorem stating what the transformation preserves, e.g. object initialization or exceptions, but also what it modifies and how, e.g. object allocation. We implement this IR in Sawja, a Java static analysis toolbench. Then, we study the Static Single Assignment (SSA) form, an IR widely used in modern compilers and verifiers. We implement and prove in Coq an SSA middle-end for the CompCert C compiler. For the proof of SSA optimizations, we identify a key semantic property of SSA, allowing for equational reasoning. Finally, we study the semantics of concurrent Java IRs. Due to instruction reorderings performed by the compiler and the hardware, the current definition of the Java Memory Model (JMM) is complex, and unfortunately formally flawed. Targetting x86 architectures, we identify a subset of the JMM that is intuitive and tractable in formal proofs. We characterize the reorderings it allows, and factor out a proof common to the IRs of a compiler.CACHAN-ENS (940162301) / SudocSudocFranceF

    Mechanised Semantics for Gated Static Single Assignment

    No full text
    International audienc

    Validating Dominator Trees for a Fast, Verified Dominance Test

    Get PDF
    International audienceThe problem of computing dominators in a control flow graph is central to numerous modern compiler optimizations. Many efficient algorithms have been proposed in the litterature, but mechanizing the correctness of the most sophisticated algorithms is still considered as too hard problems, and to this date, verified compilers use less optimized implementations. In contrast, production compilers, like GCC or LLVM, implement the classic, efficient Lengauer-Tarjan algorithm [12], to compute dominator trees. And subsequent optimization phases can then determine whether a CFG node dominates another node in constant time by using their respective depth-first search numbers in the dominator tree. In this work, we aim at integrating such techniques in verified compilers. We present a formally verified validator of untrusted dominator trees, on top of which we implement and prove correct a fast dominance test following these principles. We conduct our formal development in the Coq proof assistant, and integrate it in the middle-end of the CompCertSSA verified compiler. We also provide experimental results showing performance improvement over previous formalizations
    corecore